Handheld portable optical scanner and method of using

ABSTRACT

A system and method for real-time or near-real-time processing and post-processing of RGB-D image data using a handheld portable device and using the results for a variety of applications. The disclosure is based on the combination of off-the-shelf equipment (e.g., an RGB-D camera and a smartphone/tablet computer) in a self-contained unit capable of performing complex spatial reasoning tasks using highly optimized computer vision algorithms. New applications are disclosed using the instantaneous results obtained and the wireless connectivity of the host device for remote collaboration. One method includes steps of projecting a dot pattern from a light source onto a plurality of points on a scene, measuring distances to the points, and digitally reconstructing an image or images of the scene, such as a 3D view of the scene. A plurality of images may also be stitched together to reposition the orientation of the view of the scene.

CLAIM TO PRIORITY

This application claims priority to and is a continuation of co-pending U.S. patent application Ser. No. 14/254,648, filed Apr. 16, 2014, entitled “Handheld Portable Optical Scanner and Method of Using,” which is herein incorporated by reference and assigned to the assignee of the present application. U.S. patent application Ser. No. 14/254,648 is a continuation-in-part of U.S. patent application Ser. No. 13/839,987, filed Mar. 15, 2013, issued as U.S. Pat. No. 9,332,243, and also claims priority to, and the benefit of, Prov. Appl. 61/812,580, of the same title, filed on Apr. 16, 2013; both of which are hereby incorporated by reference in their entirety. U.S. patent application Ser. No. 13/839,987 claims priority to, and the benefit of, Provisional Application 61/715,223, filed Oct. 17, 2012, which is also incorporated by reference in its entirety.

FIELD OF THE DISCLOSURE

The field of this disclosure is that of a system and method for obtaining image data via a handheld portable optical imaging scanner and for processing the image and depth data via a variety of methods.

BACKGROUND

This technique has its roots in imaging systems.

SUMMARY

One embodiment is a handheld imaging system for capturing a multiplicity of images of a scene and determining a precise location of a plurality of points in each image in a coordinate system, in real time. The system includes an image capture device for capturing and storing a plurality of digital images, a depth computation system for deriving depth information for a plurality of arbitrary points in the images in real time, a handheld computer system having a processor, a display with user interface controls, and a communications module, wherein the processor is in communication with said image capture device and a depth computation system. The system also includes a non-transitory computer readable medium having encoded thereon a set of instructions executable by the processor to cause the image capture device, the spot location system and the communications module to perform one or more operations, the instructions including capturing a plurality of digital images in digital data of a scene and capturing a location for a plurality of points in said scene from the image capture device and from the depth computation system, storing said digital data, combining the plurality of images together to form a single 3D model, displaying the single 3D model in the mobile handheld device, and manipulating the digital file data of the single 3D model according to user-based inputs to a processing and control system.

Another embodiment is a method for taking a plurality of images. The method includes steps of capturing a plurality of digital images in digital data of a scene and capturing a location for a plurality of points in said scene from the image capture device and from a spot location system, storing said digital data, combining the plurality of images together to form a single 3D model, displaying the single 3D model in the mobile handheld device, and manipulating the digital file data according to user-based inputs to a processing and control system.

Yet another embodiment is a handheld imaging system for capturing a multiplicity of images of a scene and determining a precise location of a plurality of points in each image in a coordinate system, in real time. The system includes an image capture device for capturing and storing a plurality of digital images, a depth computation system for deriving depth information for a plurality of arbitrary points in the images in real time, a handheld computer system having a processor, a display with user interface controls, and a communications module, wherein the processor is in communication with said image capture device and the spot location system. The system also includes a non-transitory computer readable medium having encoded thereon a set of instructions executable by the processor to cause the image capture device, the spot location system and the communications module to perform one or more operations, the instructions for capturing a plurality of digital images in digital data of a scene and capturing a location for a plurality of arbitrary points in said scene from the image capture device and from the spot location system, storing said digital data, combining the plurality of images together to form a single 3D model, and manipulating the digital file data according to user-based inputs to a processing and control system, wherein said manipulation step includes relating the camera location and pose to a real-world coordinate reference system.

Another embodiment is a method. The method includes steps of capturing an image in digital data with a camera, storing digital data of the image on a non-transitory computer readable medium, extracting grey values from image color channels of the image digital data, creating image pyramids from the grey values and from depth data of the digital data, and computing a scene fitness value using the image pyramids. The method also includes steps of predicting a camera pose, aligning the image with each element of a first subset of selected keyframes, yielding an aligned pose and also yielding a quantity of poses and a corresponding quantity of overlap values with respect to the selected keyframes, computing a new camera pose estimate using the aligned image and the quantity of poses and the quantity of overlap values, and creating a keyframe from the digital data of the image using the new camera pose estimate when desired. The method also includes, after the keyframe is created, selecting a second subset of keyframes different from the first subset of keyframes and repeating the step of aligning with each keyframe of the second subset of keyframes to yield a plurality of pose values and overlap values, deciding for each element of the selected first subset of keyframes and the selected second subset of keyframes whether new links are required to the keyframe in a keyframe pose graph, and linking the keyframes.

Other embodiments of the method include a method wherein the keyframe is created upon user command from the digital image data and the camera pose equal to an identity matrix if a set of existing keyframes is empty, or a camera pose equal to the estimated camera pose if the set of existing keyframes is non-empty. In another embodiment of this method, the digital image data used to create a keyframe comprises a red-green-blue- and depth-channel (RGB-D) image. Another step of the method includes requesting and loading a new digital image from the camera or from the non-transitory computer readable medium and using the new digital image to repeat the steps of the previous paragraph. Using this method, one can also represent a 3D model through a collection of spatially positioned and oriented keyframes. Another embodiment further includes correcting the digital data of the stored image for characteristics of the camera according to a camera calibration. In another embodiment, the camera pose is predicted using rotational data from the group consisting of: a visual predictor based on image optical flow; data from a digital gyroscope; a linear motion model; and a combination of these data sources. The method for representing a 3D model may further include sending data for visualization of the digital images and of the 3D model to a graphics processing unit.

In another embodiment, after the step of computing the new aligned pose, the method may include checking for a failure in alignment and, if there is a failure in alignment, performing a re-localization procedure. If there is no failure in alignment, the method may further include preparing a real-time correctly oriented visualization in space of a saved 3D model for a user. In another embodiment, the method may further include computing estimates of overlap between the aligned image and a selection of keyframes to determine whether to create a new keyframe. In accomplishing this method, a decision to create a new keyframe is determined from data comprising overlap values between the aligned image and a selection of keyframes and comprising digital depth data from the selection of keyframes. In another embodiment, the method further includes reprojecting depth data from the digital image into a plurality of keyframes, each reprojection using an estimate of a relative pose resulting from an alignment step of the depth data with the keyframe and a calibration model of the camera. This method may also include recomputing depth data for a keyframe using a combination of reprojected depth data and the existing depth data of the keyframe.

In another embodiment, the method of representing a 3D model may include transforming the 3D model from a local coordinate system to another coordinate system using digital images of physical targets acquired with the image, wherein the target is selected from the group consisting of: a checkerboard target; a QR code or QR-like code; and a user-selected point. This method may further include adding, updating, differencing or refining geometry of a pre-existing 3D model with data acquired from the camera using a spatial pose of a visible target for positioning and orientating the camera in space, the visible target being also present in the 3D model, the 3D model selected from the group consisting of a plurality of digital images or keyframes, a spatial dataset and a CAD file. This method may also include a step in which a position and an orientation of the camera for the existing 3D model is determined based on a localization procedure using an alignment algorithm that minimizes geometric and photometric alignment error. In methods using a 3D model, the pre-existing 3D model may be loaded either from a digital storage medium or from another computer accessible through a computer network. In another embodiment, the method may further include appending an additional image to a set of linked keyframes, the additional image selected from the group consisting of: 3D data captured with the camera; 3D data captured by a laser scanner; and data from a 3D model or CAD software. In another embodiment in which images are represented with a 3D model, the method may further include representing the 3D model in a compressed manner by compressing digital image data of keyframes into a lossless format.

Another embodiment of the present disclosure is an apparatus. The apparatus is useful for performing the methods described above, and for collecting a plurality of spatially positioned and oriented keyframes and representing the collection of spatially positioned and oriented keyframes as a 3D model. The apparatus may include an RGB-D camera for capturing and storing a plurality of digital images, the RGB-D camera including a depth computation system for deriving depth information for a plurality of arbitrary points in the digital images in real time. The apparatus may also include a handheld computer system having a processor, a display with user interface controls, and a communications module, wherein the processor is in communication with the camera and the depth computation system, and a non-transitory computer readable medium having encoded thereon a set of instructions executable by the processor to cause the RGB-D camera, the depth computation system, the handheld computer system and the communications module to perform one or more operations to gather the plurality of digital images and the depth information to form the 3D model, the 3D model suitable for presentation to a user on a user interface, the 3D model also suitable for appending additional data to update the 3D model.

Another embodiment of the present disclosure is a method. The method includes steps of capturing an image in digital data with a camera, storing the image on a non-transitory computer readable medium, correcting the stored image for characteristics of the camera, analyzing the calibrated frame and extracting scene information from the calibrated frame, determining a position and an orientation in space of the imager with respect to the image and a 3D reference frame and aligning the imager in accordance with the step of analyzing, preparing a 3D model with the image and the determined position and orientation in space, the 3D model suitable for presentation, and capturing an additional image in digital data with the camera and adding data from the additional image to the 3D model. In another embodiment, the camera comprises a Red-Green-Blue-Depth imager. In another embodiment, the method further includes preprocessing the image before the step of analyzing. In another embodiment, the method includes a step of analyzing a quality of the position and orientation in space and adjusting the position and orientation in space. In yet another embodiment, the method includes generating a new 3D reference frame using the image after the step of determining. The method may also include presenting the 3D model on a user interface.

Another embodiment of the present disclosure is also a method. The method includes steps of capturing an image in digital data with a camera, storing data of the image on a non-transitory computer readable medium, analyzing a calibrated frame and extracting scene information from the calibrated frame, determining a position and an orientation in space of the imager with respect to the image and a 3D reference frame and aligning the imager in accordance with the step of analyzing, preparing a 3D model with the image and the determined position and orientation in space, the 3D model suitable for presentation, and capturing an additional image in digital data with the camera and adding data from the additional image to the 3D model. In another embodiment, the method includes using existing data as the 3D reference frame for orienting the image. In yet another embodiment, the method includes storing the image on a non-transitory computer readable medium at a remote location.

These examples are not intended to be limiting, but rather illustrative of the capabilities of our system.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosure and the following detailed description of certain embodiments thereof may be understood by reference to the following figures:

FIG. 1 depicts a system block diagram in an embodiment of the present disclosure.

FIGS. 2A-2B depict system block diagrams for embodiments of components of the present disclosure including an image capture and depth map determiner.

FIG. 3 depicts a flowchart for operational use of the present disclosure.

FIGS. 4A-4D depict a series of flowcharts for the internal steps performed by the image capture operating system in deriving a three-dimensional representation of a scene.

FIG. 5 is a list of the various programs in several embodiments of the disclosure.

FIG. 6 illustrates color/depth maps as seen from keyframes useful in representing 3-D scenes using 2-D image maps.

FIG. 7 illustrates keyframe representations taken from different points of view in a scene.

FIG. 8 illustrates a file structure format in which the file header contains general scene information.

DETAILED DESCRIPTION OF THE DISCLOSURE

The following is a written description of the present disclosure, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and sets forth the best mode contemplated by the inventors of carrying out the disclosure.

Image capture systems for use in documenting the as-built condition of a structure or the shape and contours of an object are well-known in the arts. However, most do not operate in real time and are not configured for intensive image manipulation in the field. This disclosure includes a portable, handheld 3D image capturing system that enables creation of ready-to-use 3D models in seconds. The system of the present disclosure provides high resolution, real-time 3D images with a high frame rate of capture. Thus, it is capable of capturing scenes with moving parts, or when the image capture device itself is in motion. High precision location of objects and features is provided, with millimeter and sub-millimeter accuracy.

The workflow for operation and integration into the daily set of tasks involved in 3D image capture is streamlined and simplified, thus improving user productivity. The 3D image capture system is small, light-weight and extremely easy to use. A user can have a 3D model of a scene stored in memory and displayed on a local screen in a few seconds. Examples of typical workflows are shown in FIGS. 4A-4D.

The main components of the product are shown in FIG. 1, System Block diagram 100. The user desires to capture a scene of interest 101. The Image Capture and Depth Determiner 102 is used for capturing multiple images of a scene along with a depth map for each scene. A handheld tablet or smartphone 110 is used for implementing a 3D rendering system and operating the complete system. The tablet or smartphone is configured to enable a user to operate the image capture device, to obtain a data stream of images with depth map information for the image, which may include depth information for pre-determined spots in the image, and to perform any of a number of image manipulations based on additional software available to the tablet/smartphone computer. The handheld computer has programs for its internal operations and applications programs for managing the image capture and various processing tasks. Software for improved image processing 140 includes programs that can capture and deliver image and depth information, combine multiple images into a single 3D model, and support viewing, analyzing, and managing the result. Software for image and model manipulation and management in real time 160 is another main embodiment of the disclosure. Additional software is available for performing many other functions on the captured and combined images, for a variety of purposes. The features and functions of each of these components are next described in detail.

Image Capture and Depth Map Determiner 102

In an embodiment, the 3D imager comprises a Red-Green-Blue-Depth (RGB-D) camera as the principal sensor, operated by and with a tablet computer or a smartphone. The camera is configured to operate with a Structured-Light depth map determination system. Alternatively, the camera may be configured to operate with a Time-of-Flight depth map generator. Sensors provide depth map data in real time using inexpensive components.

Structured Light depth map imaging systems are available from PrimeSense, an Israeli company which supplies components to the Microsoft Kinect system for detecting a user's body part positions and movements, as part of their product offerings. More information may be found at www.primesense.com. A specialized infrared light beam system 103 broadcasts a dot pattern toward a scene of interest, delivering a series of tiny dots that illuminate the objects in the scene. The number and spacing of the dots defines the resolution obtainable. An imager 104, similar to what is found in digital cameras, captures the scene and the illuminating dots in a special sensor, called the PS1080. Imager 104 includes an image capture lens and a CMOS sensor. The sensor derives a synchronized depth image, a color image, and optionally an audio data stream.

See FIGS. 2A-2B for additional details. FIG. 2A depicts a block diagram of the components of a PrimeSense image capture and depth map determiner 200. Components include a microprocessor 201, with an IR light transmitting capability 203 and a depth-determining CMOS functionality 205. If audio is desired, an audio section 208 may include one or more microphones 209 and one or more, possibly several, audio sources 211, for generating and detecting sound to accompany the image or images taken. The system may also include separate memory 213 or portable memory 215, such as the USB flash drive shown. The depth map is created in real time for each captured image. Module 102 may also include a custom processor 105, which may include a control portion 106 and storage 107 for color and depth information. FIG. 2B depicts a PrimeSense image capture and depth map determiner system 230. The system includes a light source 231 and light detector 233 to illuminate target of interest 235 and detect light reflected from the target 235. The system control 237 may include a microprocessor 239 with its own memory 241 and input/output systems 243.

Similarly, fully integrated sensors 108 for performing Time-of-Flight (TOF) distance measurements without any moving parts are available from the PMD Technologies Co. in Siegen, Germany. More information about these systems may be found at www.pmdtec.com. The sensor generates a modulated optical signal, and measures time of flight directly.

For example, the PMD PhotonICs 19k-S3 chipset obtains distance measurements to each pixel instantly, thus providing both a 3D representation of each pixel in view in the scene and grey scale illumination data, simultaneously. Data from the chipset may be read out at rates of 15 MPixels/second. PMDTech also offers a complete camera system called the CamBoard, which is the first USB-powered single-board 3D TOF camera. Other companies with similar products include SoftKinetic and MESA Imaging. The capture rate for these sensors permits image frame capture at rates up to 60 frames/second (fps). These sensors do not provide the same level of resolution that more complicated and more expensive scanners can provide. However, with the combining system employed in various embodiments of the disclosure, many of the limitations are overcome.

Multiple image capture devices may be used and their data streams delivered to the handheld/tablet or smartphone computer device. Image capture devices from alternate suppliers may be employed to deliver image data as well. For example, robots carrying imagers can be employed in hard-to-reach places such as tunnels or sewer systems.

Handheld Tablet/Smartphone for Implementing a 3D Rendering System and Operating the Complete System

In an embodiment of the present disclosure, the tablet computer, handheld computer, or smartphone shown at 110 serves as the user interface for controlling the image sensor and depth capture sensor subsystem 102. The tablet computer may be any of the products offered on the market such as the iPad by Apple Computer, the Galaxy III by Samsung, and many others. Similarly, an embodiment of the present disclosure may be realized with a smartphone such as an iPhone, offered by Apple Computer, or the Galaxy family of smartphones offered by Samsung, or various Android phones offered by the HTC company of Taiwan, or the Razr offered by Motorola. All of these products contain an operating system 130 configured to run and manage the tablet itself, and to implement a host of applications such as those in embodiments of the present disclosure.

The essential elements of a handheld computer are the ability to operate it while holding it in one or two hands, without any additional support; to be able to see the resultant two-dimensional (2D) image as captured by the image/depth capture module 102, on a display 116; and to be able to input control information and commands via either a touch screen (also at 116) or an optional keyboard at 117. An audio output 118 is also desirable, if not absolutely necessary. The processor 111 available in current tablet computers has suitably fast clock operations, greater than 1.0-1.4 GHz, to facilitate real-time operation of the image/depth capture system and process the image and depth data, to provide a visible image in near-real to real time. Additional features and functions common in most if not all of such handheld computers available today and connected on bus 112 may include a second internal camera 113 and a communications system 114 further comprising at least one of a cellular telephony link, a cellular data link, and a Wi-Fi link.

Software such as Operating System 130 contains applications for operating these accessory functions, along with data management and storage in ROM 119, RAM 120, and Data storage 121, which may comprise an external memory device like a USB memory stick, or any other suitable non-volatile storage medium. Besides the operating system, software may include image processing software suite 140, image and data management software suite 160, and a suite of software for imager calibration 190. As outlined below, each of these may include a variety of separate programs. In an embodiment of the present disclosure, audio capture via the custom processor 105 and audio playback via software in the operating system 130 enable capture and playback of sounds during image capture as well. This feature facilitates verbal note-taking while performing the image data capture if so desired. While the computer may be handheld, a local positioning system 115 or aiming system may also be used.

Software for Image Capture and Rendering to Form a 3D Data Set

A number of software programs useful in the present disclosure are listed in FIG. 5. In an embodiment, image processing software 140 is provided for using a stream of RGB-D video frames to form the combined 3D data set. These include Program 141, Image Capture and Pre-processing, one of the group of applications 140, the Computer Vision and Scanning suite. For capturing and rendering, the suite includes a real-time RGB-D image visualization program, Program 142, shown in FIG. 5 as part of Image Processing suite 140. The software may be configured to operate on a portable handheld device like a tablet computer or a smartphone.

In an embodiment, new stitching or combining software is used to automatically merge two or more images together to form a composite 3D model. With this software tool, a model may be created from one or more images taken from different viewpoints in the scene. The result forms the basis for creating a panoramic image. This process is done in real time, on the fly, so that the user can, at any time, view the formed 3D model, even during capturing, from a variety of viewpoints. This includes the current viewpoint of the attached camera, resulting in an Augmented-Reality-style visualization. The instant 3D model formation enables a user to see exactly where additional data points might be taken, and enables the user to point the camera to the desired region in need of more detail. In other words, holes in the image of the scene can be fixed on the fly. Additionally, the quality of the data in the scene can be assessed, and additional images from different viewpoints can be obtained as needed.

Elements of the software include suggestions for user-initiated actions to complete a portion of a scanned image, including directions to aim the image capture device. Because of the power of the combining algorithms used, including the capability of 3-D reconstruction, images obtained from other instruments with differing levels of detail may also be inputted into the tablet or smartphone computer system. The advantage of fast combining or modeling means that field adjustments and retakes can be done in near real-time with these other instruments as well as with the instant system. Image capture devices which also produce 3D depth maps along with greyscale or color images, such as those built by Trimble Navigation Limited, FARO, Z+F, and so forth, may be inputted to this system.

The software provides an integrity metric to indicate when there is not enough data to perform a decent combining or 3-D modeling operation or to obtain registration of a particular image with a previously declared registration point in the image. The declared registration point may be obtained from an arbitrary model, either from a Computer-Aided Design (CAD) model or a 3D point cloud model. The user interface is changed in a way that the user sees or is notified where there is not enough data captured in the scene as the scene is being combined or modeled.

In an embodiment, in Program 143 for example, the software is configured for real-time alignment of 3D derived data with an RGB image, thus putting a high-resolution photo image into spatial context with the 3D derived spatial data. In another embodiment, the software in Program 144 is configured to enable a user to compare the collected or imaged data to the RGB frame, showing the difference in a way or method that shows the user, on the User Interface (UI), where data does not match the RGB position. This may be due to an inability of the ranging system to extract the distance for each illumination dot in the image frame, based on the color, reflection or other environmental condition of the scene. This may be due to a variety of causes, such as lack of surface illumination, too much illumination, ripping or tearing of the surface edge, or a need for more image data. This may be done in real time with the results displayed and made available, for example, in an augmented reality (AR) situation. Program 145 includes additional capabilities for processing and registering images in post-processing operations, with capabilities for real-time results and displays with AR applications.

Software for Image Manipulation and Management in Real Time

A suite of software programs 160 is available to the user for performing a number of different operations on the captured and/or processed images with the associated 3D information. In one embodiment, a 3D Modeler software algorithm, Program 161, processes a real-time RGB-D range or depth map data stream on the handheld computer system to create a 3D model of the recorded scene as the user captures the data. The frames or a group of frames are used to reconstruct the scene as the device is moving through the scene. In contrast to the disclosure described in points 27 to 29, this embodiment describes a formed 3D model with basic geometric primitives (polygons, planes, cylinders, boxes, etc., as used in common CAD systems) as opposed to having individual 3D points. In constructing the 3D-primitive model, not all points may be used from each frame; rather, the best points are selected, which may include all the points in the image, as a reference for the stitching or registration from frame to frame, or used as a registration anchor when geo-referencing in other data. The 3D Modeler program also may add its data to the RGB image data in a seamless combination. The 3D Modeler program may add its data to a pointcloud 3D model, or to a panoramic stitched image, or to both.
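By way of illustration only, the following Python/NumPy sketch shows one common way a basic geometric primitive, here a plane, could be fit to a set of selected 3D points by least squares; the function name and interface are assumptions made for this example and are not taken from the disclosure.

    import numpy as np

    def fit_plane(points):
        """Fit a plane to an (N, 3) array of 3D points.

        Returns (centroid, unit_normal); the plane is the set of x with
        dot(unit_normal, x - centroid) == 0.  Least-squares fit via SVD of
        the centered point cloud: the normal is the direction of least
        variance (smallest singular value).
        """
        pts = np.asarray(points, dtype=float)
        centroid = pts.mean(axis=0)
        _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
        normal = vt[-1]                      # direction of least variance
        return centroid, normal / np.linalg.norm(normal)

    # Example: noisy samples of the plane z = 0.5
    rng = np.random.default_rng(0)
    xy = rng.uniform(-1.0, 1.0, size=(200, 2))
    pts = np.column_stack([xy, 0.5 + 0.001 * rng.standard_normal(200)])
    c, n = fit_plane(pts)
    print(c, n)   # centroid near (0, 0, 0.5), normal near (0, 0, +/-1)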

In Program 162 (suite 160) an existing 3D CAD model or 3D point cloud model is obtained and displayed in a manner relating to the current scene capture. For example, a virtual model can be registered with the just-captured, processed, and fused 3D model. For visualization, a virtual camera is used that can take on any desired viewpoint. If the current viewpoint of the camera during capturing is used for visualization, the visualization mode is generally referred to as Augmented Reality (AR). The existing model may be downloaded via the communication link from a remote library or storage facility, as may be found in Cloud storage. In another embodiment, new data may be captured and added to an existing model. The user can select one or more points, or groups of points, by selecting from the data of the scene as shown on the User Interface; alternatively, the user may select a pre-determined set of range data. The user can define known values in the data, for example, a sphere or target of some known type that has either a current geo-referenced position or a point, group of points, or a derived location from a set of points into a known transform.

In another embodiment, software algorithms are provided to enable a user to extract known shapes or a particular geometry, such as Program 163 in suite 160, from a captured image of a scene, and to export a definition that allows the shape to be reconstructed on another remote device, like a desktop computer, or another smartphone or tablet. Alternatively, the extracted and modeled shape can be stored in a remote storage facility, and used in another program that places the object in a specified geo-referenced model of the current scene.

In another embodiment of an available software algorithm, Program 164 in suite 160, a user may operate a program that is configured to determine and visually display differences between the scanned captured model and a pre-existing model. This is particularly useful for comparing the as-built condition of a structure or object with the desired design specification.

In an embodiment of an available software algorithm, Program 165, a user may operate a program configured to perform a transform of captured data for a particular scene to an externally provided model of a desired design for such a scene. With this program, the newly captured scene data may be matched to the design coordinate reference frame.

In another embodiment of an available software algorithm, a user may operate a program configured to perform real-time data streaming via a communications link, Program 166, to a remote storage facility, or to a remote computer for display and manipulation by another person. This operation enables joint sharing of instant image data, for improved workflow involving making changes, taking new image capture operations, and sharing observations about the results. The communications link may be configured to include voice communications as well as the image data communications. This type of communications-based image/operational information sharing enables a central manager to supervise and review one or more remote data collection operations in real time or near-real-time. Additional features include the ability to direct data transfers from and to other image capture devices as may be associated with a given handheld computer system.

In another embodiment of an available program, Program 167, one or more basic RGB-D images may be transmitted directly without performing an integration of the RGB-D frames into a 3D model in the capturing computer. Instead, the model creation may be carried out remotely at a Cloud-based server and made available to other interested parties via cloud access. This transmission and conversion process may be done in real time as the data is collected. Alternatively, it may be done in a post-processed operation, and any or all of the data may be extracted from a storage facility, locally on the tablet, or stored in a remote storage facility such as a cloud-based service, and manipulated in a remote location by another interested party.

In addition, in an embodiment, one or more basic RGB-D frames may be compressed and streamed to a remote location for storage and further processing, as described above. In another embodiment, the program is configured to enable the user to select individual frames for transmission to the remote facility for storage or viewing and manipulation. In yet another embodiment, a program 168 is available for providing a registration geo-reference point to incorporate and match to a selected location point in a captured image.

In another embodiment, an available program 169 is configured to extend and fill in an existing 3D model with newly recorded 3D data. The new data is manipulated by the software algorithm so that it blends seamlessly with the pre-existing data. In another embodiment, a program 170 is available to extract surface angles from captured RGB-D imagery in real time and to provide immediate visualization of the surface angles. The program is further configured to create an augmented-reality (AR) form for the display of the angles.

The Handheld Portable Computer: Tablet, Smartphone, or Notebook

The handheld computer as described above may comprise a tablet computer of the kind available from Apple Computer, ASUS, Samsung, Blackberry, Microsoft, and the like. The handheld computer may comprise a Smartphone of the type offered by Apple Computer, Samsung, Nokia, HTC, Blackberry, and the like. The handheld computer may comprise a Notebook type of portable computer with a suitable form factor for handheld operation and manipulation, such as provided by ASUS, Sharp, HP, Dell, and the like. The handheld computer may be configured to record and display data from one or more image capture devices, sequentially or simultaneously.

A display software program 171 is available to provide one or more graphical user interfaces for operating the various programs cited previously. Graphical user interfaces (GUIs) may be embedded in each of the operating programs. Housekeeping functions, such as changing a viewpoint for a model, trimming or extending the data, converting between formats, seeking and displaying additional information, and running simulations, are included. Program 172 is configured to provide and manage on-screen images for Post-Capture visualization.

Program suite 190 is configured to provide a calibration suite of programs for calibrating the imager. It contains Program 191 for calibration of projective images, and Program 192 for calibration of depth for structured light systems.

Flow Charts

FIG. 3 depicts a flow chart 300 demonstrating a workflow for multiple operations of the imager/computer system in a typical field operation. The real-time capture and manipulation made possible by the combination of fast hardware and fast software make all the steps recited in FIG. 3 possible. The start of the workflow begins, in this example, with an RGB/Image-capable tablet computer, or other suitable digital imager. If the scene of interest is viewed 301 remotely, the image frame(s) are streamed 303 to a remote system. If the remote system is a cloud server 305, the scene or scenes are combined 313, using an algorithm from a remote tablet or computer on the cloud server. If the image is not being viewed 311 in real time, e.g., the user is working with stored data, still with a remote system 311, the scenes or images are combined 313, as noted, using the algorithm from a tablet or other suitable computer on the cloud server. The images may then be used as is or subjected to further post-processing. If the image is being viewed in real time 307, a frame tool is used that allows the user to pull frames out of the data, pulling out the desired number of frames to make a model or point cloud 309 for the desired image or images. The images preferably conform to industry-standard formats 315. If the images do conform, the data may be saved 317 using industry-standard point-cloud formats for the images. As noted in FIG. 3, these may include a number of engineering/construction formats 318, entertainment formats 320, such as gaming formats, or security formats 322, such as those for law enforcement or the military.

If the user, on the other hand, is present at the scene, then frames or images are combined in real time 321 and a 3-D model is created and stored locally or on a remote drive. If additional processing is desired, such as for creating composite images or for manipulating the images, a check may be made 323 as to whether data exists, e.g., position or location data, that would allow registration of the image or images, such as to allow stitching. If not, the scene or image is saved as is 325. If data exists that would allow registration, then one or more existing images, scenes, points or CAD data sets are used as a geo-reference system 327. The data is then aligned or registered to the base or anchor data in real time by imaging over the existing data to define a transform 329, and the data is then saved 325. Reference systems include but are not limited to GPS, WGS-84 and NAD83. Reference systems may also include local northing and easting, such as from a county reference system, and may also include any convenient datum, such as a local on-site reference spot, like the cornerstone of a new building, the pupils of a person's eyes, and the like.

Additional flowcharts for image processing are also detailed in FIGS. 4A-4D. In FIG. 4A, steps are disclosed for internal operations for image capture and processing. The process 400 of FIG. 4A includes a first step 401 in which a new RGB-D frame has been taken and is available for processing. The depth channel may be corrected 402 according to the camera calibration. Grey values are created or extracted 403 from the image color channels, and image pyramids are created 404 for both the grey channel and the depth channel. In the next step, the structure of the current RGB-D frame is analyzed 405, using a coarser pyramid level if speed is desired. A scene fitness value is computed that corresponds to the condition number of the covariance matrix computed by aligning the frame against itself under 6-dof pose movement.
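The disclosure does not give explicit formulas for these steps; the following Python/NumPy sketch shows one plausible reading, in which a simple 2x2-averaging pyramid is built and the fitness value is taken as the condition number of the 6-dof normal-equation (approximate covariance) matrix assembled from per-pixel photometric Jacobians. The pinhole projection model, the twist ordering [vx, vy, vz, wx, wy, wz], and the function interfaces are assumptions made for this sketch.

    import numpy as np

    def build_pyramid(img, levels=3):
        """Simple image pyramid: each level halves resolution by 2x2 averaging."""
        pyr = [img.astype(np.float32)]
        for _ in range(1, levels):
            p = pyr[-1]
            h, w = (p.shape[0] // 2) * 2, (p.shape[1] // 2) * 2
            p = p[:h, :w]
            pyr.append(0.25 * (p[0::2, 0::2] + p[1::2, 0::2] +
                               p[0::2, 1::2] + p[1::2, 1::2]))
        return pyr

    def scene_fitness(grey, depth, fx, fy, cx, cy, step=4):
        """Condition number of the 6-dof normal-equation matrix obtained by
        'aligning the frame against itself': a large value means the scene
        poorly constrains some pose direction (e.g. a flat, textureless wall)."""
        grey = grey.astype(np.float32)
        gy, gx = np.gradient(grey)                   # d/dv, d/du image gradients
        v, u = np.mgrid[0:grey.shape[0]:step, 0:grey.shape[1]:step]
        u, v = u.ravel(), v.ravel()
        Z = depth[v, u].astype(np.float32)
        keep = Z > 0
        u, v, Z = u[keep], v[keep], Z[keep]
        Ix, Iy = gx[v, u], gy[v, u]
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
        zeros = np.zeros_like(Z)
        # d(pixel)/d(se(3) twist) for the photometric term
        du = np.stack([fx / Z, zeros, -fx * X / Z**2,
                       -fx * X * Y / Z**2, fx * (1 + X**2 / Z**2), -fx * Y / Z], axis=1)
        dv = np.stack([zeros, fy / Z, -fy * Y / Z**2,
                       -fy * (1 + Y**2 / Z**2), fy * X * Y / Z**2, fy * X / Z], axis=1)
        J = Ix[:, None] * du + Iy[:, None] * dv      # one 1x6 Jacobian row per pixel
        H = J.T @ J                                  # 6x6 normal-equation matrix
        return float(np.linalg.cond(H))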

The fitness value describes the system's ability to perform real-time frame alignment using the given RGB-D frame. The current camera pose is predicted 406 using the final estimated pose from the last iteration. Also used in predicting the current camera pose is a visual predictor, data from a system gyroscope, or a linear motion model. A pose is a camera position and orientation in space, typically described by a 3-vector for its translation and a 3×3 orthonormal matrix for its rotation. Then, a set S of N existing keyframes is selected 407 for alignment with the estimation of the current camera pose. S is determined by a breadth-first search in the keyframe graph starting with the current active keyframe. The next steps are taken in parallel, with data for visualization and noise reduction (including RGB-D for the current frame) uploaded 408 to the graphics processing unit (GPU). At the same time, the current RGB-D frame is aligned 409 to each of the keyframes in the selected set S, using the predicted pose P as a starting point. The alignment step minimizes geometrical and photometrical alignment error between a frame pair over 6-dof pose variation, following the motion of points under the Lie algebra se(3). The result of this step is the desired number (N) of pose updates, one for each keyframe in the set, and the same number, N, of overlap values between the current RGB-D frame and the particular keyframe.
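A minimal sketch of the breadth-first selection of the keyframe set S follows, assuming the keyframe graph is held as an adjacency dictionary; that representation, and the function name, are assumptions made for illustration rather than the disclosed data structure.

    from collections import deque

    def select_keyframes(graph, active_id, n):
        """Breadth-first selection of up to n keyframes, starting from the
        active keyframe, over the keyframe-graph adjacency (dict: id -> ids)."""
        selected, seen, queue = [], {active_id}, deque([active_id])
        while queue and len(selected) < n:
            kf = queue.popleft()
            selected.append(kf)
            for neighbor in graph.get(kf, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return selected

    # Example keyframe graph: 0-1, 1-2, 1-3, 3-4
    graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3]}
    print(select_keyframes(graph, active_id=1, n=3))   # -> [1, 0, 2]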

Depending on host device capabilities, two optional intermediate steps may be performed before step 409 to improve the quality of the camera pose estimate and reduce pose drift throughout the entire capture. As shown in FIG. 4B, step 408A, a predicted RGB-D frame F is rendered on the GPU, using the RGB-D data of the keyframes in S, the pose of the current active keyframe, and the relative keyframe poses in S with respect to the active keyframe. The pose estimate P is then updated or improved using dense alignment of the current RGB-D frame against F. The execution of the steps of 408A depends on the availability of suitable computer languages for the host device GPU. The process then continues to step 409, as discussed above, and then to step 410. From the best available camera pose estimate P and a set of N computed individual per-keyframe relative poses, a new camera pose estimate P+ is computed using a weighted average of all the input poses.
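The disclosure states only that P+ is a weighted average of the input poses. One hedged way to realize such an average is sketched below: translations are averaged linearly and rotations are averaged by projecting the weighted sum of rotation matrices back onto SO(3); the overlap values could serve as the weights. The chordal rotation mean and the interface are assumptions for this example.

    import numpy as np

    def average_poses(poses, weights):
        """Weighted average of rigid poses.

        poses   : list of (R, t) with R a 3x3 rotation, t a length-3 translation.
        weights : per-pose weights (e.g. the per-keyframe overlap values).

        Translations are averaged linearly; rotations are averaged by summing
        the weighted rotation matrices and projecting back onto SO(3) via SVD.
        """
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
        t_avg = sum(wi * np.asarray(t, dtype=float) for wi, (_, t) in zip(w, poses))
        M = sum(wi * np.asarray(R, dtype=float) for wi, (R, _) in zip(w, poses))
        U, _, Vt = np.linalg.svd(M)
        R_avg = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
        return R_avg, t_avg

    # Usage: two poses rotated about z, equal weights
    Rz = lambda a: np.array([[np.cos(a), -np.sin(a), 0.0],
                             [np.sin(a),  np.cos(a), 0.0],
                             [0.0, 0.0, 1.0]])
    R_new, t_new = average_poses([(Rz(0.0), [0, 0, 0]), (Rz(0.2), [0.1, 0, 0])],
                                 weights=[0.5, 0.5])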

A third flowchart 460 for another part of the method for image capture and image processing is depicted in FIG. 4C. After P+ is computed, alignment metrics are analyzed 461, including a normalized geometrical RMS alignment error, the number of occluded points and the overlap value. Alignment metrics from all the previous alignment steps are analyzed, and a re-localization procedure is entered 465 if the analysis suggests an alignment failure. In that case, none of the following steps are performed.

If the alignment is good, the keyframe in S with the highest amount of overlap with respect to the current frame becomes the new active keyframe. If certain conditions or criteria are met 462, a new keyframe is created 463 as desired from the current RGB-D data and the current estimated pose, and is included in the model and becomes the active keyframe. The criteria are based on the computed N overlap values, the current camera pose and the poses of the N keyframes. If the criteria are not met, the process goes directly to steps 467 and 469 in parallel, as discussed below. The criteria may include a number of requirements. One useful criterion involves a predicted overlap of a first frame A and a second frame B. A high number of matching visual or geometrical features, or both, are extracted from frame A and frame B. The matches should be consistent with a relative pose between A and B.

If the criteria are met, a new keyframe is created 463 and a set T of M keyframes is selected from the model based on the criteria of step 462. Step 409 is performed on T, and each keyframe in T that yields a high overlap value will be linked 464 to the new keyframe with its relative pose in the keyframe graph. Every keyframe of set S that has yielded good alignment results will be additionally linked to the new keyframe in the keyframe graph. Next, two steps should take place in parallel. A new RGB-D frame is requested and loaded 467 from the sensor for the next frame. In addition, an augmented-reality visualization is drawn 469 using the current estimated camera pose and the spatial scene data that has been uploaded to the GPU. Subsequently, on the GPU, the current frame depth data is reprojected 471 into each of the N selected keyframes, using the current estimated pose and the camera calibration model. The depth data of the keyframes is recomputed incorporating the new measured depth data from the current frame.
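A simplified sketch of the depth-reprojection step 471 is given below, assuming both frames share one pinhole intrinsic matrix K and using plain averaging as a stand-in for whatever fusion rule the implementation actually applies; the function name and arguments are invented for this example.

    import numpy as np

    def reproject_depth(depth_cur, K, R_rel, t_rel, kf_depth):
        """Reproject the current frame's depth map into a keyframe and merge it.

        depth_cur : (H, W) depth of the current frame (0 = invalid).
        K         : 3x3 camera intrinsic matrix (assumed shared by both frames).
        R_rel, t_rel : relative pose mapping current-frame points into the keyframe.
        kf_depth  : (H, W) existing keyframe depth map, updated in place by averaging.
        """
        H, W = depth_cur.shape
        fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
        v, u = np.mgrid[0:H, 0:W]
        Z = depth_cur
        ok = Z > 0
        # Back-project valid pixels to 3D in the current camera frame.
        X = np.stack([(u[ok] - cx) * Z[ok] / fx,
                      (v[ok] - cy) * Z[ok] / fy,
                      Z[ok]], axis=1)
        # Transform into the keyframe's camera frame and project.
        Xk = X @ R_rel.T + t_rel
        front = Xk[:, 2] > 0
        Xk = Xk[front]
        uk = np.round(fx * Xk[:, 0] / Xk[:, 2] + cx).astype(int)
        vk = np.round(fy * Xk[:, 1] / Xk[:, 2] + cy).astype(int)
        inside = (uk >= 0) & (uk < W) & (vk >= 0) & (vk < H)
        uk, vk, zk = uk[inside], vk[inside], Xk[inside, 2]
        # Merge: average with existing keyframe depth where present, else fill.
        old = kf_depth[vk, uk]
        kf_depth[vk, uk] = np.where(old > 0, 0.5 * (old + zk), zk)
        return kf_depth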

A fourth flowchart is depicted in FIG. 4D. This is an abbreviated process for steps for image capture and image processing. A first step is to capture 481 a new RGB-D frame using the imager. The captured RGB-D frame is pre-processed 482 according to the particular imager calibration. The calibrated RGB-D frame is then analyzed 483 and general scene information is extracted. The frame is then used to determine 484 the current camera pose or orientation with respect to one or more RGB-D reference frames, that is, to align the frame. The results of the alignment are then used to analyze 485 the quality of the pose estimate. On system request, the estimated pose and the current RGB-D frame are then used to precision-align 486 existing reference frames, and a new reference frame is then generated 487 from the current RGB-D frame. The estimated pose and the current RGB-D frame are then used to extend, improve and/or sculpt 488 the existing 3D model. The user interface on the screen is then updated 488 with the newly computed results. The results may be used for optional augmented-reality-style visualization with suitable equipment. In addition, the process may be repeated 490 as often as desired for better alignment.

Closeup Utility for Human Physiognomy

In an embodiment, the integrated 3D imager can be used to capture and provide measurements of human or animal physiognomy. Measurements for eyeglasses and contacts can easily be obtained with millimeter accuracy. Detailed maps of the head, the cornea, eyes, ears, and the like may be captured in a few seconds. Similarly, 3D imagery of other body parts may be obtained, for use in making prosthetics, or for use by plastic surgeons in creating models for making adjustments to one's physiognomy or for providing surgical repair for accidents or other damage.

As an example of a typical operation, one may first measure the eyes and nose of a person. From that information, the separation between the eyes, the interpupillary distance, can be found. The shape and size of the nose can be found. The location and size of the ears relative to the location of the eyes and nose can be found, including the distances, so a pair of eyeglass temples can be specified. Models of eyeglass frames may be selected by a buyer from a catalog. Digitally stored 3D models of the frames can be overlaid in the image to check for fit and to see if they suit the buyer. Such a service could be an aid to selling eye care products. In another embodiment, the image of the person's face can be inverted so that the person sees what he would see in a mirror.

Scene Compression, Efficient Scene Storage/Compression in Binary Data Format

A classic problem with 3D point cloud data is the large resulting file sizes when the data is stored in an uncompressed manner. Efficiently compressing point clouds, however, is only possible when there is structure in the data that can be used to extract (and compress) redundancy in the data.

Since the software in the main invention uses “keyframes” to represent the 3D scene, the structure necessary for compression is given in the form of 2D image maps. Keyframes are regular 2-dimensional RGB color images taken from different viewpoints in the scene. Every keyframe also has a depth map attached to it that carries depth information for each pixel. Depth is defined as the distance of a point to the camera center along the camera optical axis. Depth maps and color images are registered so that for each pixel in the color image its depth can be looked up at the corresponding pixel in the depth image. In addition, each keyframe carries the camera “pose” (extrinsic) information that encodes the camera position and viewing angles in a matrix, and the camera internal (intrinsic) parameters (optical center, field of view, radial distortions). The 3D position of a pixel in a keyframe can be recovered by taking into account the depth of the pixel and the camera extrinsic and intrinsic parameters. See FIGS. 6 and 7.
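For illustration, recovering the 3D position of a keyframe pixel from its depth and the camera intrinsic and extrinsic parameters might look like the following Python/NumPy sketch; radial distortion is ignored, and the camera-to-world pose convention and function name are assumptions made for this example.

    import numpy as np

    def keyframe_pixel_to_world(u, v, depth_map, K, R_wc, t_wc):
        """Recover the 3D world position of pixel (u, v) of a keyframe.

        depth_map : registered depth image (same resolution as the color image).
        K         : 3x3 intrinsic matrix (focal lengths and optical center).
        R_wc,t_wc : keyframe pose (extrinsic), mapping camera coordinates to world.
        """
        Z = depth_map[v, u]                     # depth along the optical axis
        x_cam = np.array([(u - K[0, 2]) * Z / K[0, 0],
                          (v - K[1, 2]) * Z / K[1, 1],
                          Z])
        return R_wc @ x_cam + t_wc              # camera -> world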

Since the scene information is represented by a set of 2-dimensional image/depth maps, it is suited for compression by traditional image encoding techniques like JPG or PNG. For encoding the depth maps, a lossless format (like PNG) should be chosen to avoid geometric error in the recovered model. Following that approach, a scene can be stored as a file comprising a general “header” section. The header may include, among other things, meta-information about the scene, the data capture process and the global coordinate transformation. This may be followed by sections for each keyframe, each keyframe section storing, among other things, the keyframe camera extrinsic and intrinsic parameters and their color image and depth maps, encoded in an appropriate format. See FIG. 8 for examples.
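A hedged sketch of such a scene file is shown below: a ZIP container with a JSON header section followed by one section per keyframe holding its pose, intrinsics, color image and depth map. The container layout and field names are invented for this example, and lossless DEFLATE compression inside the ZIP stands in for the PNG/JPG encoding mentioned above.

    import json
    import zipfile
    import numpy as np

    def save_scene(path, header, keyframes):
        """Store a scene as a single file: a general header section plus one
        section per keyframe (pose, intrinsics, color image, depth map).

        header    : dict of scene meta-information (capture info, global transform, ...).
        keyframes : list of dicts with keys 'pose' (4x4), 'K' (3x3),
                    'color' (H, W, 3 uint8) and 'depth' (H, W float32).
        """
        with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as z:
            z.writestr("header.json", json.dumps(header))
            for i, kf in enumerate(keyframes):
                meta = {"pose": np.asarray(kf["pose"]).tolist(),
                        "K": np.asarray(kf["K"]).tolist(),
                        "shape": list(kf["depth"].shape)}
                z.writestr(f"keyframe_{i:04d}/meta.json", json.dumps(meta))
                z.writestr(f"keyframe_{i:04d}/color.raw",
                           np.ascontiguousarray(kf["color"], dtype=np.uint8).tobytes())
                z.writestr(f"keyframe_{i:04d}/depth.raw",
                           np.ascontiguousarray(kf["depth"], dtype=np.float32).tobytes())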

In addition to the RGB and depth maps, several other images may optionally be stored for each keyframe, including masks and confidence maps.

Semi-Automated Transform Determination Using Artificial Targets

When a scene or object is scanned in 3D, the acquired data lives in a local coordinate system as determined by the sensor used, until it is determined where the data is located in a “global” coordinate system (which could be a true global coordinate system in terms of latitude/longitude/height or any project-specific local coordinate system). This coordinate transformation from local to global can be described by a Euclidean or similarity transform in three dimensions, and determining the exact parameters of that transform (thus putting the data into its global context) is essential for many real-world applications (such as BIM or Augmented Reality).

One method of finding the right transformation is establishing correspondences between world points and local points and then solving for the transformation that aligns these points. Methods for establishing correspondence are usually based on natural geometric features (corners, edges, planes, etc.), photometric features (surface texture or salient points) or artificial features such as specific physical target points in the scene for which the real-world location is known. Obtaining the transformation from the correspondences is usually formulated as a linear or gradient-based minimization problem, and a solution is found using robustified linear solvers.

In order to obtain the desired coordinate transform, the disclosed invention makes use of artificial features (in the form of physical scene targets for which the global position is known) for correspondence creation. The artificial features currently in use are “checkerboard” targets, which are four black and white squares (2 black and 2 white) laid out in a checkerboard fashion and printed on a carrier material (such as paper). Correspondences are created by first designating potential checkerboard features in the scene model and then identifying each potential checkerboard feature with a corresponding point in the global reference set (or discarding the detected candidate feature if it is not a physical target or not contained in the reference set). The method currently in use for identifying checkerboard targets in the scene consists of a cascade of checks, each check removing candidates from the set of potential checkerboard features for a given sensor input. The cascade starts with all points in the current sensor input being potential checkerboard features. The first check mechanism in the cascade is based on the “Chess Corner Detector” (published here: http://goo.gl/mY10U). The second mechanism removes non-maximal points from the resulting set (in terms of “Chess score”). The third mechanism fits a 2D checkerboard image to the candidate point using ESM (http://ijr.sagepub.com/content/26/7/661.short) and an affine transformation model, and the resulting error from a binarized fit and the translational drift is used to reject candidates. The fourth stage fits two straight lines to the rectified fit (starting with one horizontal and one vertical line) and rejects based on photometric error and geometric fit deviation (from horizontal and vertical).

Using the set of detected potential target features, the identification of features with points in the reference database happens through user input. The user will thereby select a sub-selection of the current scene (which is presented to him on the device screen) and then tap on a target feature. The system will look for the closest detected target candidate in the set of candidates and “snap in” the user selection to the closest such candidate. For the selected candidate, the user then selects a corresponding target point from the reference list of actual targets. At least three such associations are necessary to start an automated fitting process that obtains the desired transformation using the correspondences. In the fitting process the system automatically associates other target candidates with reference features based on proximity, using the current best estimate of the transform data. Likewise, it rejects false candidates and matches based on the same metric. The transform fitting process uses a robustified linearized estimator that outputs 6 parameters of an updating transformation to a Euclidean transformation. The initial estimate of the Euclidean transform is given by aligning the centroids of the associated detected and reference sets and then solving directly for the remaining rotation.
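The centroid-alignment-plus-rotation initialization described above corresponds to the classical orthogonal-Procrustes (Kabsch) solution; a minimal Python/NumPy sketch, with names chosen for this example only, is:

    import numpy as np

    def fit_euclidean_transform(local_pts, global_pts):
        """Rigid (Euclidean) transform mapping local points onto global reference
        points from at least three correspondences: align the centroids, then
        solve directly for the remaining rotation (Kabsch / orthogonal Procrustes).

        Returns (R, t) such that  global ~= R @ local + t.
        """
        P = np.asarray(local_pts, dtype=float)    # (N, 3) detected targets, local coords
        Q = np.asarray(global_pts, dtype=float)   # (N, 3) reference targets, global coords
        cp, cq = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])   # guard against reflection
        R = Vt.T @ D @ U.T
        t = cq - R @ cp
        return R, t

A robustified estimator, as described above, would additionally down-weight or reject correspondences with large residuals; that refinement is omitted here for brevity.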

Instead of simple checkerboard targets, the system may use QR codes or so-called April-tags in future releases. Using QR codes or April-tags comes with the advantage of being able to associate meta-data with each target, which can be used to store project data at a particular spatial location.

Target-Assisted Global Map-Optimization and Loop-Closure

User-identified or automatically identified targets can assist the system in performing statistical global 3D-model optimization. The identification of targets across keyframes in the model yields a hard constraint for the global optimization procedure, in the sense that two scene points (as marked by a target in separate keyframes) must be the same point in global 3D space. These constraints can help enable more robust global optimization in the case of heavily misplaced input data (such as when very large loops and/or distortions are present in the input data) and improve model quality after global optimization.

Instead of physical targets, the user may also designate and identify scene points “by hand” by means of an appropriate visual user interface that allows for the precise selection of points. The user may additionally specify the desired usage of the designated scene points (for assisting loop-closure, for improving model quality, etc.), as some points may only be appropriate for some but not all usages.

Scan-Appending Capabilities

The system disclosed herein is capable of appending data to a previously scanned area using automated localization. The user can load an existing scan into memory and select an area to which he wants to start appending. The system then goes into “localization” mode, in which it attempts to identify the current sensor input (the user pointing the device at the desired area) with the selected area in the existing scan. The identification process is based on the regular keyframe alignment process as disclosed above, with the difference of using a coarser-scale resolution as the bottom level in the “coarse-to-fine” alignment scheme in order to aid convergence from misaligned viewpoints. Once a suitable alignment between the current camera RGB-D frame and the desired keyframe is found (based on an error metric consisting of geometric error and general scene geometric attributes), the system switches back to regular scan operation and appends to the scene/model as in regular scan operation.
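
Purely for illustration, the coarse-to-fine control flow used for re-localization can be sketched as below. The single-level alignment routine `align_at_level` (e.g. a dense photometric/geometric alignment step) is a hypothetical stand-in and is not shown; the pyramid indexing convention (larger index = coarser level) is an assumption.

    # Minimal sketch: refine a pose estimate from coarse to fine pyramid levels.
    def coarse_to_fine_align(frame_pyramid, keyframe_pyramid, init_pose,
                             coarsest_level, finest_level, align_at_level):
        """Run the per-level alignment from `coarsest_level` down to `finest_level`."""
        pose = init_pose
        for level in range(coarsest_level, finest_level - 1, -1):
            pose = align_at_level(frame_pyramid[level], keyframe_pyramid[level], pose)
        return pose

    # Regular scanning might start at, e.g., coarsest_level=2; localization mode
    # would pass a larger value (e.g. 4) so alignment begins on a much smaller
    # image, widening the convergence basin for misaligned viewpoints.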

The described append functionality is capable of appending to a variety of existing 3D input data, including but not limited to a) existing 3D data captured with the disclosed device, b) 3D data captured by a laser scanner, and c) 3D data as created by 3D-modeling or CAD software.

Target Detection and Identification

The user identifies targets after capture by tapping/clicking on an imaged target displayed in a keyframe's RGB data on the host device screen and entering the target ID. The system can automatically try to identify and position April-tags or other QR-code or bar-code-like targets in the RGB-D stream available to the device upon user command. The user command is usually a simple button press, upon which the system attempts the described identification. Detected targets, codes, tags, etc. are visualized to the user during scene capture by projecting their 3D position into 2D screen coordinates (using the sensor pose and its camera intrinsic parameters) and highlighting the area around the projected target point(s). The system can also try to identify and position April-tags or other QR-code or bar-code-like targets after capture in the 3D model by performing detection, identification and positioning on the individual keyframes instead of the current RGB-D frame. Detected/identified targets are visualized either as highlighted areas in the rendering of the 3D point cloud or as highlighted areas in the keyframe images.
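
By way of illustration, projecting a detected target's 3D position into 2D screen coordinates with a pinhole intrinsic model can be sketched as below; the parameter names are illustrative assumptions, and numpy is assumed.

    # Minimal sketch: project a 3D target point into pixel coordinates using
    # the current sensor pose (world-to-camera) and the camera intrinsics.
    import numpy as np

    def project_to_screen(p_world, world_to_camera, fx, fy, cx, cy):
        """Return (u, v) pixel coordinates, or None if the point lies behind the camera."""
        p_cam = (world_to_camera @ np.append(p_world, 1.0))[:3]
        if p_cam[2] <= 0:
            return None
        u = fx * p_cam[0] / p_cam[2] + cx
        v = fy * p_cam[1] / p_cam[2] + cy
        return u, v

The highlighted region on the display would then be drawn around the returned (u, v) coordinates.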

Auto-Targeting on a Tablet or Mobile Device

The system disclosed herein includes an ability, in real or near real time, to stitch spatial data based on an existing survey control network as the data is being collected, or to use the network for real-time validation and geo-referencing. This includes the use of known survey targets or control in a scene, allowing the device to auto-locate or geo-reference to a reference coordinate system. It also includes the use of a known survey target or control system together with a method of displaying the fit of the network as the data is geo-referenced into the network. A plane or control point is placed on the first target, allowing the user to accept or decline the point; as the second point is captured, the target or control location is updated, locating the captured data in the correct geo-referenced network. Once the third control point is located, the data is fit to the known points and a value of the fit is applied to the scene, allowing the user to accept the fit or to continue fitting control points to the solution, by adding additional control points or removing points, until the data meets the project requirements.
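
One plausible reading of the “value of the fit” presented to the user is a root-mean-square residual over the accepted control points, as sketched below. The transform estimation itself (from three or more correspondences) is as sketched earlier; (R, t) here denotes the current geo-referencing estimate, and numpy is assumed.

    # Minimal sketch: fit quality of the geo-referencing transform over the
    # accepted control points (lower is better).
    import numpy as np

    def fit_rms(R, t, scanned_points, control_points):
        """Root-mean-square distance between transformed scan points and their
        surveyed control coordinates."""
        scanned = np.asarray(scanned_points, dtype=float)
        control = np.asarray(control_points, dtype=float)
        residuals = (scanned @ R.T + t) - control
        return float(np.sqrt((residuals ** 2).sum(axis=1).mean()))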

The disclosed system includes the ability to use targets or control points without identification of labels or attributes, by using only the position of the targets, the spatial offsets between targets, and their angles to match existing points to imaged targets present in the real-time RGB-D stream available to the tablet or mobile device. This includes the ability to search an image or spatial dataset for a group of points based on shapes represented by CAD objects, allowing the spatial data to be transformed into a geo-referenced coordinate plane as the data is collected. The system thus has the ability to load an existing scan or image into a tablet or mobile device, to append new data to the original dataset or a group of datasets, and to transform the newly collected data to the original dataset, while not using targets or control. The system also has the ability to fit CAD shapes on a tablet or mobile device, using a ranging sensor and a camera to fit new CAD shapes to the existing data.
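
As an illustrative sketch of label-free matching, candidate triples of imaged targets can be compared to triples of reference points using only their mutual distances (their spatial offsets), since rigid motion preserves distances; the angle checks and the full search strategy of the disclosure are omitted, and the tolerance value is hypothetical.

    # Minimal sketch: match unlabeled target triples by their pairwise distances.
    import itertools
    import numpy as np

    def side_lengths(p, q, r):
        return sorted([np.linalg.norm(p - q), np.linalg.norm(q - r), np.linalg.norm(r - p)])

    def match_triples(detected, reference, tol=0.02):
        """Yield (detected_indices, reference_indices) pairs whose three mutual
        distances agree within `tol` (in scene units)."""
        detected = [np.asarray(p, float) for p in detected]
        reference = [np.asarray(p, float) for p in reference]
        for di in itertools.combinations(range(len(detected)), 3):
            d_sides = side_lengths(*(detected[i] for i in di))
            for ri in itertools.combinations(range(len(reference)), 3):
                r_sides = side_lengths(*(reference[i] for i in ri))
                if all(abs(a - b) <= tol for a, b in zip(d_sides, r_sides)):
                    yield di, ri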

Real-Time in-the-Field Differencing

The disclosed system is capable of performing functionality based on the differencing of existing 3D data versus existing conditions. The differencing ability makes use of the system's real-time append, (re-)localization and 3D alignment abilities to achieve registration between the existing 3D data and the 3D (or RGB-D) data captured by the device.
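
Once the captured data has been registered to the existing data, one simple way to express the differencing result is a per-point deviation map, as in the sketch below; the nearest-neighbour search via a KD-tree is an assumed choice, and scipy and numpy are assumed to be available. The threshold is illustrative.

    # Minimal sketch: flag captured points that deviate from the existing 3D data.
    import numpy as np
    from scipy.spatial import cKDTree

    def deviations(existing_points, captured_points, threshold=0.01):
        """Return per-point distances to the nearest existing point and a boolean
        mask of captured points that deviate by more than `threshold` (scene units)."""
        tree = cKDTree(np.asarray(existing_points, dtype=float))
        dists, _ = tree.query(np.asarray(captured_points, dtype=float))
        return dists, dists > threshold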

General Principles

While only a few embodiments of the present disclosure have been shown and described, it will be obvious to those skilled in the art that many changes and modifications may be made thereunto without departing from the spirit and scope of the present disclosure as described in the following claims. All patent applications and patents, both foreign and domestic, and all other publications referenced herein are incorporated herein in their entireties to the full extent permitted by law. While the disclosure has been described in connection with certain preferred embodiments, other embodiments would be understood by one of ordinary skill in the art and are encompassed herein.

The methods and systems described herein may be deployed in part or in whole through a machine that executes computer software, program codes, and/or instructions on a processor. The present disclosure may be implemented as a method on the machine, as a system or apparatus as part of or in relation to the machine, or as a computer program product embodied in a computer readable medium executing on one or more of the machines. The processor may be part of a server, client, network infrastructure, mobile computing platform, stationary computing platform, or other computing platform. A processor may be any kind of computational or processing device capable of executing program instructions, codes, binary instructions and the like. The processor may be or include a signal processor, digital processor, embedded processor, microprocessor or any variant such as a co-processor (math co-processor, graphic co-processor, communication co-processor and the like) and the like that may directly or indirectly facilitate execution of program code or program instructions stored thereon. In addition, the processor may enable execution of multiple programs, threads, and codes.

If more than one processing core is available, the threads may be executed simultaneously to enhance the performance of the processor and to facilitate simultaneous operations of the application. By way of implementation, methods, program codes, program instructions and the like described herein may be implemented in one or more threads. The thread may spawn other threads that may have assigned priorities associated with them; the processor may execute these threads based on priority or any other order based on instructions provided in the program code. The processor may include memory that stores methods, codes, instructions and programs, and non-transitory data, as described herein and elsewhere. The processor may access a storage medium through an interface that may store methods, codes, and instructions as described herein and elsewhere. The storage medium associated with the processor for storing methods, programs, codes, program instructions or other types of instructions capable of being executed by the computing or processing device may include but may not be limited to one or more of a CD-ROM, DVD, memory, hard disk, flash drive, RAM, ROM, cache and the like.

A processor may include one or more cores that may enhance the speed and performance of a multiprocessor. In embodiments, the processor may be a dual core processor, quad core processor, or other chip-level multiprocessor and the like that combines two or more independent cores (called a die). The methods and systems described herein may be deployed in part or in whole through a machine that executes computer software on a server, client, firewall, gateway, hub, router, or other such computer and/or networking hardware. The software program may be associated with a server that may include a file server, print server, domain server, internet server, intranet server and other variants such as a secondary server, host server, distributed server and the like. The server may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other servers, clients, machines, and devices through a wired or a wireless medium, and the like. The methods, programs or codes as described herein and elsewhere may be executed by the server. In addition, other devices required for execution of methods as described in this application may be considered as a part of the infrastructure associated with the server.

The server may provide an interface to other devices including, without limitation, clients, other servers, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of programs across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope of the disclosure. In addition, any of the devices attached to the server through an interface may include at least one storage medium capable of storing methods, programs, code and/or instructions. A central repository may provide program instructions to be executed on different devices. In this implementation, the remote repository may act as a storage medium for program code, instructions, and programs.

The software program may be associated with a client that may include a file client, print client, domain client, internet client, intranet client and other variants such as a secondary client, host client, distributed client and the like. The client may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other clients, servers, machines, and devices through a wired or a wireless medium, and the like. The methods, programs or codes as described herein and elsewhere may be executed by the client. In addition, other devices required for execution of methods as described in this application may be considered as a part of the infrastructure associated with the client.

The client may provide an interface to other devices including, without limitation, servers, other clients, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of programs across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope of the disclosure. In addition, any of the devices attached to the client through an interface may include at least one storage medium capable of storing methods, programs, applications, code and/or instructions. A central repository may provide program instructions to be executed on different devices. In this implementation, the remote repository may act as a storage medium for program code, instructions, and programs.

The methods and systems described herein may be deployed in part or in whole through network infrastructures. The network infrastructure may include elements such as computing devices, servers, routers, hubs, firewalls, clients, personal computers, communication devices, routing devices and other active and passive devices, modules and/or components as known in the art. The computing and/or non-computing device(s) associated with the network infrastructure may include, apart from other components, a non-transitory storage medium such as flash memory, buffer, stack, RAM, ROM and the like. The processes, methods, program codes, and instructions described herein and elsewhere may be executed by one or more of the network infrastructural elements.

The methods, program codes, and instructions described herein and elsewhere may be implemented on a cellular network having multiple cells. The cellular network may be either a frequency division multiple access (FDMA) network or a code division multiple access (CDMA) network. The cellular network may include mobile devices, cell sites, base stations, repeaters, antennas, towers, and the like. The cell network may be a GSM, GPRS, 3G, EVDO, mesh, or other network type.

The methods, program codes, and instructions described herein and elsewhere may be implemented on or through mobile devices. The mobile devices may include navigation devices, cell phones, mobile phones, mobile personal digital assistants, laptops, palmtops, netbooks, pagers, electronic book readers, music players and the like. These devices may include, apart from other components, a storage medium such as flash memory, buffer, RAM, ROM and one or more computing devices. The computing devices associated with mobile devices may be enabled to execute program codes, methods, and instructions stored thereon. Alternatively, the mobile devices may be configured to execute instructions in collaboration with other devices. The mobile devices may communicate with base stations interfaced with servers and configured to execute program codes. The mobile devices may communicate on a peer to peer network, mesh network, or other communications network. The program code may be stored on the storage medium associated with the server and executed by a computing device embedded within the server. The base station may include a computing device and a storage medium. The storage device may store program codes and instructions executed by the computing devices associated with the base station.

The computer software, program codes, and/or instructions may be stored and/or accessed on machine readable media that may include: computer components, devices, and recording media that retain digital data used for computing for some interval of time; semiconductor storage known as random access memory (RAM); mass storage typically for more permanent storage, such as optical discs and forms of magnetic storage like hard disks, tapes, drums, cards and other types; processor registers, cache memory, volatile memory, non-volatile memory; optical storage such as CD, DVD; removable media such as flash memory (e.g. USB sticks or keys), floppy disks, magnetic tape, paper tape, punch cards, standalone RAM disks, Zip drives, removable mass storage, off-line, and the like; and other computer memory such as dynamic memory, static memory, read/write storage, mutable storage, read only, random access, sequential access, location addressable, file addressable, content addressable, network attached storage, storage area network, bar codes, magnetic ink, and the like.

The methods and systems described herein may transform physical and/or intangible items from one state to another. The methods and systems described herein may also transform data representing physical and/or intangible items from one state to another. The elements described and depicted herein, including in flow charts and block diagrams throughout the figures, imply logical boundaries between the elements. However, according to software or hardware engineering practices, the depicted elements and the functions thereof may be implemented on machines through computer executable media having a processor capable of executing program instructions stored thereon as a monolithic software structure, as standalone software modules, or as modules that employ external routines, code, services, and so forth, or any combination of these, and all such implementations may be within the scope of the present disclosure. Examples of such machines may include, but may not be limited to, personal digital assistants, laptops, personal computers, mobile phones, other handheld computing devices, medical equipment, wired or wireless communication devices, transducers, chips, calculators, satellites, tablet PCs, electronic books, gadgets, electronic devices, devices having artificial intelligence, computing devices, networking equipment, servers, routers and the like. Furthermore, the elements depicted in the flow chart and block diagrams or any other logical component may be implemented on a machine capable of executing program instructions. Thus, while the foregoing drawings and descriptions set forth functional aspects of the disclosed systems, no particular arrangement of software for implementing these functional aspects should be inferred from these descriptions unless explicitly stated or otherwise clear from the context.

Similarly, it will be appreciated that the various steps identified and described above may be varied, and that the order of steps may be adapted to particular applications of the techniques disclosed herein. All such variations and modifications are intended to fall within the scope of this disclosure. As such, the depiction and/or description of an order for various steps should not be understood to require a particular order of execution for those steps, unless required by a particular application, or explicitly stated or otherwise clear from the context.

The methods and/or processes described above, and steps thereof, may be realized in hardware, software or any combination of hardware and software suitable for a particular application. The hardware may include a general purpose computer and/or dedicated computing device or specific computing device or particular aspect or component of a specific computing device. The processes may be realized in one or more microprocessors, microcontrollers, embedded microcontrollers, programmable digital signal processors or other programmable devices, along with internal and/or external memory. The processes may also, or instead, be embodied in an application specific integrated circuit, a programmable gate array, programmable array logic, or any other device or combination of devices that may be configured to process electronic signals. It will further be appreciated that one or more of the processes may be realized as computer executable code capable of being executed on a machine readable medium.

The computer executable code may be created using a structured programming language such as C, an object oriented programming language such as C++, or any other high-level or low-level programming language (including assembly languages, hardware description languages, and database programming languages and technologies) that may be stored, compiled or interpreted to run on one of the above devices, as well as heterogeneous combinations of processors, processor architectures, or combinations of different hardware and software, or any other machine capable of executing program instructions.

Thus, in one aspect, each method described above and combinations thereof may be embodied in computer executable code that, when executing on one or more computing devices, performs the steps thereof. In another aspect, the methods may be embodied in systems that perform the steps thereof, and may be distributed across devices in a number of ways, or all of the functionality may be integrated into a dedicated, standalone device or other hardware. In another aspect, the means for performing the steps associated with the processes described above may include any of the hardware and/or software described above. All such permutations and combinations are intended to fall within the scope of the present disclosure.

While the disclosure has been disclosed in connection with the preferred embodiments shown and described in detail, various modifications and improvements thereon will become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the present disclosure is not to be limited by the foregoing examples, but is to be understood in the broadest sense allowable by law. All documents referenced herein are hereby incorporated by reference.

What is claimed is:
 1. A method comprising: capturing an image in digital data with a camera; storing digital data of the image on a non-transitory computer readable medium; extracting gray values from image color channels of the image digital data; creating image pyramids from the gray values and from depth data of the digital data; computing a scene fitness value using the image pyramids; predicting a camera pose; aligning the image with each element of a first subset of selected keyframes, yielding an aligned pose and also yielding a quantity of poses and a corresponding quantity of overlap values with respect to the selected keyframes; computing a new camera pose estimate using the aligned image and the quantity of poses and the quantity of overlap values; creating a keyframe from the digital data of the image using the new camera pose estimate when desired; after the keyframe is created, selecting a second subset of keyframes different from the first subset of keyframes and repeating the step of aligning with each keyframe of the second subset of keyframes to yield a plurality of pose values and overlap values; deciding for each element of the selected first subset of keyframes and the selected second subset of keyframes whether new links are required to the keyframe in a keyframe pose graph; and linking the keyframes.